190-31: Estimate the False Discovery Rate Using SAS®
Abstract
This paper gives an exposition of recently developed methods of estimating the false discovery rate (FDR) under multiple comparisons and discusses their implementation using SAS. For example, microarray experiments typically involve tests of significance for hundreds or thousands of genes. For biologists confronted by this problem of multiplicity, the FDR is an appealing quantification of error. Recent literature establishes theoretical properties of an estimator for FDR associated with any chosen critical region. SAS code using the SAS/MACRO, SAS/STAT, and SAS/IML facilities is presented to compute this estimate, and appears in the appendix. Q-values, which are to FDR as p-values are to the type I error rate, are also obtained and used to reproduce plots generated by the R package "qvalue" that may be helpful for conducting multiple tests. Two approaches to estimating the proportion of true null hypotheses are considered. The first arises by choosing several values of a tuning parameter and fitting a smoothing spline (e.g., using PROC TPSPLINE) to the resulting estimates. The second is to fit a two-component mixture to the sample of observed p-values via maximum likelihood (using PROC NLMIXED). A microarray example using p-values from tests on 384 genes is given.
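As a rough illustration of the quantities the abstract names, the following is a minimal Python sketch, not the SAS code from the paper's appendix. It assumes the widely used Storey-type estimator: the proportion of true nulls is estimated from the p-values above a tuning value `lam`, the FDR for the critical region [0, t] is that proportion times m·t divided by the number of rejections, and a q-value is the smallest estimated FDR over all such regions containing the p-value. The function names and the single fixed `lam` are assumptions made for illustration; the smoothing-spline and mixture-model steps described above are omitted.

```python
def estimate_pi0(pvalues, lam=0.5):
    """Estimate the proportion of true nulls for one tuning value lam.

    Assumption: a single fixed lam is used here; the abstract instead
    smooths estimates over several lam values with a spline.
    """
    m = len(pvalues)
    above = sum(1 for p in pvalues if p > lam)
    return min(1.0, above / (m * (1.0 - lam)))


def estimate_fdr(pvalues, threshold, pi0):
    """Estimated FDR for the critical region [0, threshold]."""
    m = len(pvalues)
    rejected = sum(1 for p in pvalues if p <= threshold)
    # max(..., 1) avoids division by zero when nothing is rejected.
    return min(1.0, pi0 * m * threshold / max(rejected, 1))


def qvalues(pvalues, pi0):
    """Return q-values in the same order as the input p-values."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    q = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, carrying the running minimum
    # so that q-values are monotone in the p-values.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pi0 * m * pvalues[i] / rank)
        q[i] = running_min
    return q
```

A typical call sequence would be `pi0 = estimate_pi0(p)`, then `estimate_fdr(p, 0.05, pi0)` for a chosen threshold, or `qvalues(p, pi0)` to rank all tests at once.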
Similar resources
The False Discovery Rate in Simultaneous Fisher and Adjusted Permutation Hypothesis Testing on Microarray Data
Background and Objectives: In recent years, new technologies have produced large amounts of data, and in biology, microarray technology has also developed dramatically. Meanwhile, the Fisher test is used to compare a control group with two or more experimental groups and to detect differentially expressed genes. In this study, the false discovery rate was investiga...
SA 09 Estimating the False Discovery Rate using SAS
This paper gives an exposition of some recently developed methods of estimating the false discovery rate (FDR) in the setting of multiple comparisons and discusses their implementation with SAS and JMP. The FDR provides an interpretable quantification of error and an alternative to the familywise error rate that may be suitable in situations such as high throughput genomics experiments. SAS cod...
FDR_TEST: A SAS Macro for Calculating New Methods of Error Control in Multiple Hypothesis Testing
The testing of multiple null hypotheses in a single study is a common occurrence in applied research. The problem of Type I error inflation or probability pyramiding in such contexts has been well known for many years. General procedures for the control of Type I error rates in multiple testing are the Bonferroni procedure and its more recent modifications. These procedures partition a desired...
Local false discovery rate estimation using feature reliability in LC/MS metabolomics data
False discovery rate (FDR) control is an important tool of statistical inference in feature selection. In mass spectrometry-based metabolomics data, features can be measured at different levels of reliability and false features are often detected in untargeted metabolite profiling as chemical and/or bioinformatics noise. The traditional false discovery rate methods treat all features equally, w...
A Note on Estimating the False Discovery Rate under Mixture Model
In this note, we focus on estimating the false discovery rate (FDR) of a multiple testing method with a common, non-random rejection threshold under a mixture model. We develop a new class of estimates of the FDR and prove that it is less conservatively biased than what is traditionally used. Numerical evidence is presented to show that the mean squared error (MSE) is also often smaller for the...
Publication date: 2006